Executable sizes

The simplest Haskell program is main = pure (). It does nothing. It can be compiled to an executable file with GHC – for example, with GHC 9.12.2, by commanding:

This command will output Main.hi, Main.o and Main.exe – the last being the executable file. We can find its size in bytes by commanding (in PowerShell):

About 10.15 million bytes is not small, for an executable that does nothing.
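The same measurement can also be taken from Haskell. The following is a minimal sketch using only functions from base; the names fileSizeBytes and inMillions are illustrative and are not part of any tool mentioned here.

```haskell
import System.IO (IOMode (ReadMode), hFileSize, withBinaryFile)

-- | Size of a file in bytes (hypothetical helper, base only).
fileSizeBytes :: FilePath -> IO Integer
fileSizeBytes path = withBinaryFile path ReadMode hFileSize

-- | Express a byte count in millions of bytes, rounded to two decimal places.
inMillions :: Integer -> Double
inMillions n =
  fromIntegral (round (fromIntegral n / 10000 :: Double) :: Integer) / 100
```

In GHCi, fileSizeBytes "Main.exe" >>= print . inMillions would print the size in the same units used above, assuming Main.exe is in the current directory.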

Linking

By making GHC verbose, with its -v flag, and looking for -l options during the linking step, it is possible to identify what is linked in producing Main.exe:
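Picking the -l options out of a captured linker command line can also be sketched in Haskell. This is a rough illustration only: it assumes the command line has been saved as a single string, it splits on whitespace (so quoted arguments with spaces are not handled), and linkedLibraries is a hypothetical name.

```haskell
import Data.List (isPrefixOf)

-- | Keep the tokens that look like -l<name> options and return the
-- library names without the -l prefix.
linkedLibraries :: String -> [String]
linkedLibraries = map (drop 2) . filter isLibOption . words
  where
    -- A bare "-l" with no name attached is ignored.
    isLibOption tok = "-l" `isPrefixOf` tok && length tok > 2
```

For a made-up command line such as "gcc -o Main.exe Main.o -lfoo -lbar", the function returns ["foo","bar"]. The library names in that example are placeholders, not what GHC actually links.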

On Windows, Stack comes with certain tools, provided by the MSYS2 project. They include the Cygwin project’s ldd.exe tool:

We can identify what is actually dynamically loaded by commanding:

and examining the results.

Microsoft also provides the tool dumpbin.exe (the Microsoft COFF Binary File Dumper) as part of Build Tools for Visual Studio 2022. We can identify the DLLs that are actually imported by Main.exe using the command:

They are reflected in the comments above.

Executable files

On Windows, valid executable files are in the Portable Executable (PE) format. The PE format extends the Common Object File Format (COFF).

A COFF header specifies the number of sections and is followed by the sections table (a sequence of section headers). A section header uses 32 bits for flags that specify the characteristics of the section, as set out in the table below.

| Value (hexadecimal) | Flag name (prefixed by IMAGE_SCN_) | Description |
|---|---|---|
| 00000001 | TYPE_REG** | Reserved for future use. |
| 00000002 | TYPE_DSECT** | Reserved for future use. |
| 00000004 | TYPE_NO_LOAD** | Reserved for future use. |
| 00000008 | TYPE_NO_PAD | Obsolete.* |
| 00000010 | TYPE_COPY** | Reserved for future use. |
| 00000020 | CNT_CODE | The section contains executable code. |
| 00000040 | CNT_INITIALIZED_DATA | The section contains initialized data. |
| 00000080 | CNT_UNINITIALIZED_DATA | The section contains uninitialized data. |
| 00000100 | LNK_OTHER | Reserved for future use. |
| 00000200 | LNK_INFO | The section contains comments or other information.* |
| 00000400 | TYPE_OVER** | Reserved for future use. |
| 00000800 | LNK_REMOVE | The section will not become part of the image.* |
| 00001000 | LNK_COMDAT | The section contains COMDAT data.* |
| 00008000 | GPREL | The section contains data referenced through the global pointer. |
| 00010000 | MEM_PURGEABLE | Reserved for future use. |
| 00020000 | MEM_16BIT | Reserved for future use. |
| 00040000 | MEM_LOCKED | Reserved for future use. |
| 00080000 | MEM_PRELOAD | Reserved for future use. |
| 00100000 .. 00E00000 | ALIGN_1BYTES to ALIGN_8192BYTES | Align data on a specified byte boundary.* |
| 01000000 | LNK_NRELOC_OVFL | The section contains extended relocations. |
| 02000000 | MEM_DISCARDABLE | The section can be discarded as needed. |
| 04000000 | MEM_NOT_CACHED | The section cannot be cached. |
| 08000000 | MEM_NOT_PAGED | The section is not pageable. |
| 10000000 | MEM_SHARED | The section can be shared in memory. |
| 20000000 | MEM_EXECUTE | The section can be executed as code. |
| 40000000 | MEM_READ | The section can be read. |
| 80000000 | MEM_WRITE | The section can be written to. |

* Flag is valid only for object files. ** Deduced from .NET SectionCharacteristics enumeration documentation.
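To illustrate how a 32-bit characteristics value is read against the table above, here is a minimal Haskell sketch that tests a value against a subset of the flags. The names knownFlags and decodeFlags are hypothetical, and only some of the single-bit flags are listed; the alignment range is left out.

```haskell
import Data.Bits ((.&.))
import Data.Word (Word32)

-- | A subset of the flags in the table, without the IMAGE_SCN_ prefix.
knownFlags :: [(Word32, String)]
knownFlags =
  [ (0x00000020, "CNT_CODE")
  , (0x00000040, "CNT_INITIALIZED_DATA")
  , (0x00000080, "CNT_UNINITIALIZED_DATA")
  , (0x02000000, "MEM_DISCARDABLE")
  , (0x10000000, "MEM_SHARED")
  , (0x20000000, "MEM_EXECUTE")
  , (0x40000000, "MEM_READ")
  , (0x80000000, "MEM_WRITE")
  ]

-- | Names of the listed flags whose bit is set in a characteristics value.
decodeFlags :: Word32 -> [String]
decodeFlags c = [name | (mask, name) <- knownFlags, c .&. mask /= 0]
```

As an illustrative input, decodeFlags 0x60000020 yields ["CNT_CODE","MEM_EXECUTE","MEM_READ"], that is, a section that contains code and can be executed and read.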

On Windows, GHC comes with certain tools. They include objdump.exe:

This is the LLVM tool llvm-objdump. The -h flag displays summaries of the headers for each section. We can command:

objdump translates combinations of COFF section characteristics into a ‘type’, such as TEXT, DATA or DEBUG.

The largest sections of Main.exe are .text (about 4.87 million bytes; 0x4a59c6) and .data (about 1.34 million bytes; 0x146200).
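The hexadecimal sizes can be converted to decimal byte counts with a short Haskell helper. This is a sketch only; hexSize is a hypothetical name, and it relies on readHex from the Numeric module in base.

```haskell
import Numeric (readHex)

-- | Parse a hexadecimal size such as "0x4a59c6"; the "0x" prefix is optional.
-- Returns Nothing if the input is not entirely hexadecimal digits.
hexSize :: String -> Maybe Integer
hexSize s = case readHex (dropPrefix s) of
  [(n, "")] -> Just n
  _         -> Nothing
  where
    dropPrefix ('0' : 'x' : rest) = rest
    dropPrefix other              = other
```

Here hexSize "0x4a59c6" gives Just 4872646 and hexSize "0x146200" gives Just 1335808, consistent with the rounded figures quoted above.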

Stripping

The tools that come with GHC on Windows include strip.exe:

The LLVM tool llvm-strip is intended to work as a drop-in replacement for GNU’s strip. Running it without options is the equivalent of running it with the flag --strip-all. For COFF files, --strip-all removes all symbols, debug sections and relocations from the output. The tool modifies the input file in place.

The default --enable-executable-stripping flag of Cabal (the library) causes Cabal’s copy and install commands to seek to run strip on the installed executable file.

Stripping has removed all the DEBUG sections:

Although smaller, about 6.80 million bytes is still not small for a program that does nothing.
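The saving from stripping can be put as a percentage with a one-line calculation. The sketch below uses the rounded sizes quoted in this section (in millions of bytes), so the result is approximate; percentSaved is a hypothetical name.

```haskell
-- | Percentage by which a size shrinks, going from 'before' to 'after'.
percentSaved :: Double -> Double -> Double
percentSaved before after = 100 * (before - after) / before
```

With the figures above, percentSaved 10.15 6.80 comes to roughly 33, that is, stripping removes about a third of the file.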

GHC 8.10.7

The output of GHC 8.10.7 was smaller. We can compare:

In the case of GHC 8.10.7, objdump.exe is the GNU project’s tool.